19 research outputs found

    A closed-form approach to Bayesian inference in tree-structured graphical models

    Full text link
    We consider the inference of the structure of an undirected graphical model in an exact Bayesian framework. To avoid convergence issues and highly demanding Monte Carlo sampling, we aim to achieve the inference with closed-form posteriors, avoiding any sampling step. This task would be intractable without any restriction on the considered graphs, so we restrict the set of considered graphs to mixtures of spanning trees. We investigate under which conditions on the priors, on both tree structures and parameters, exact Bayesian inference can be achieved. Under these conditions, we derive a fast and exact algorithm to compute the posterior probability for an edge to belong to the tree model, using an algebraic result called the Matrix-Tree theorem. We show that the assumption we have made does not prevent our approach from performing well on synthetic and flow cytometry data
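The role of the Matrix-Tree theorem in the abstract above can be illustrated with a minimal sketch. It assumes a distribution over spanning trees proportional to the product of edge weights, and uses the classical identity that the probability of an edge is its weight times the effective resistance between its endpoints. The weight matrix and the function names (`laplacian`, `tree_partition_function`, `edge_probabilities`) are hypothetical and not taken from the paper, whose algorithm derives the weights from priors and data.

```python
import numpy as np


def laplacian(weights):
    """Weighted graph Laplacian: diag(row sums) - W (W symmetric, zero diagonal)."""
    w = np.asarray(weights, dtype=float)
    return np.diag(w.sum(axis=1)) - w


def tree_partition_function(weights):
    """Matrix-Tree theorem: the sum over spanning trees of the product of edge
    weights equals the determinant of the Laplacian with one row and the
    matching column removed (here: the first)."""
    lap = laplacian(weights)
    return float(np.linalg.det(lap[1:, 1:]))


def edge_probabilities(weights):
    """P(edge (i, j) is in the tree) when P(tree) is proportional to the product
    of its edge weights. Uses w_ij * R_ij, where R_ij is the effective
    resistance computed from the Laplacian pseudo-inverse."""
    w = np.asarray(weights, dtype=float)
    lap_pinv = np.linalg.pinv(laplacian(w))
    d = np.diag(lap_pinv)
    resistance = d[:, None] + d[None, :] - 2.0 * lap_pinv
    return w * resistance


# Illustrative input (an assumption): the complete graph on 3 nodes, unit weights.
weights = np.ones((3, 3)) - np.eye(3)
probs = edge_probabilities(weights)
print(tree_partition_function(weights))  # number of spanning trees of the triangle
print(probs)                             # each edge lies in 2 of the 3 trees
```

Both quantities cost one determinant or one pseudo-inverse, which is polynomial in the number of nodes even though the number of spanning trees grows super-exponentially.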

    Deciphering the pathobiome: intra- and interkingdom interactions involving the pathogen Erysiphe alphitoides

    No full text
    Plant-inhabiting microorganisms interact directly with each other, forming complex microbial interaction networks. These interactions can either prevent or facilitate the establishment of new microbial species, such as a pathogen infecting the plant. Here, our aim was to identify the most likely interactions between Erysiphe alphitoides, the causal agent of oak powdery mildew, and other foliar microorganisms of pedunculate oak (Quercus robur L.). We combined metabarcoding techniques and a Bayesian method of network inference to decipher these interactions. Our results indicate that infection with E. alphitoides is accompanied by significant changes in the composition of the foliar fungal and bacterial communities. They also highlight 13 fungal operational taxonomic units (OTUs) and 13 bacterial OTUs likely to interact directly with E. alphitoides. Half of these OTUs, including the fungal endophytes Mycosphaerella punctiformis and Monochaetia kansensis, could be antagonists of E. alphitoides according to the inferred microbial network. Further studies will be required to validate these potential interactions experimentally. Overall, we showed that a combination of metabarcoding and network inference, by highlighting potential antagonists of pathogen species, could potentially improve the biological control of plant diseases

    Deciphering the pathobiome: intra- and interkingdom interactions involving the pathogen Erysiphe alphitoides

    No full text
    Deciphering the pathobiome: intra- and interkingdom interactions involving the pathogen Erysiphe alphitoides. 10. International symposium on phyllosphere microbiolog

    Language models can identify enzymatic active sites in protein sequences

    No full text
    Recent advances in language modeling have tremendously impacted how we handle sequential data in science. Language architectures have emerged as a hotbed of innovation and creativity in natural language processing over the last decade, and have since gained prominence in modeling proteins and chemical processes, elucidating structural relationships from textual/sequential data. Surprisingly, some of these relationships refer to three-dimensional structural features, raising important questions on the dimensionality of the information contained in sequential data. We demonstrate that the unsupervised application of a language model architecture to a language representation of bio-catalyzed chemical reactions can capture the signal at the base of the substrate-active site atomic interactions, identifying the three-dimensional active site position in unknown protein sequences. The language representation comprises a reaction-simplified molecular-input line-entry system (SMILES) for substrate and products, and amino acid sequence information for the enzyme. This approach can recover, with no supervision, 52.12% of the active site when considering co-crystallized substrate-enzyme structures as ground truth, vastly outperforming other attention-based models

    A protein coevolution method uncovers critical features of the Hepatitis C Virus fusion mechanism.

    Get PDF
    Amino-acid coevolution refers to compensatory mutational patterns that preserve the function of a protein. Viral envelope glycoproteins, which mediate entry of enveloped viruses into their host cells, are shaped by coevolution signals that give viruses the plasticity to evade neutralizing antibodies without altering viral entry mechanisms. The functions and structures of the two envelope glycoproteins of the Hepatitis C Virus (HCV), E1 and E2, are poorly described. In particular, how these two proteins mediate the HCV fusion process between the viral and the cell membrane remains elusive. Here, as a proof of concept, we aimed to take advantage of an original, recently developed coevolution method to shed light on the HCV fusion mechanism. When first applied to the well-characterized Dengue Virus (DENV) envelope glycoproteins, coevolution analysis was able to predict important structural features and rearrangements of these viral protein complexes. When applied to HCV E1E2, computational coevolution analysis predicted that E1 and E2 refold interdependently during fusion through rearrangements of the E2 Back Layer (BL). Consistently, a soluble BL-derived polypeptide inhibited HCV infection of hepatoma cell lines, primary human hepatocytes and humanized liver mice. We showed that this polypeptide specifically inhibited HCV fusogenic rearrangements, hence supporting the critical role of this domain during HCV fusion. By combining coevolution analysis and in vitro assays, we also uncovered functionally significant coevolving signals between E1 and E2 BL/Stem regions that govern HCV fusion, demonstrating the accuracy of our coevolution predictions. Altogether, our work sheds light on important structural features of the HCV fusion mechanism and contributes to advancing our functional understanding of this process. This study also provides an important proof of concept that coevolution can be employed to explore viral protein-mediated processes, and can guide the development of innovative translational strategies against challenging human-tropic viruses

    Learning ecological networks from next-generation sequencing data

    No full text
    Species diversity, and the various interactions that occur between species, support ecosystem functioning and benefit human societies. Monitoring the response of species interactions to human alterations of the environment is thus crucial for preserving ecosystems. Ecological networks are now the standard method for representing and simultaneously analyzing all the interactions between species. However, deciphering such networks requires considerable time and resources to observe and sample the organisms, to identify them at the species level and to characterize their interactions. Next-generation sequencing (NGS) techniques, combined with network learning and modelling, can help alleviate these constraints. They are essential for observing cryptic interactions involving microbial species, as well as short-term interactions such as those between predator and prey. Here, we present three case studies in which species associations or interactions have been revealed with NGS. We then review several currently available statistical and machine-learning approaches that could be used for reconstructing networks of direct interactions between species, based on the NGS co-occurrence data. Future developments of these methods may allow us to discover and monitor species interactions cost-effectively, under various environmental conditions and within a replicated experimental design framework
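The distinction between associations and direct interactions mentioned in the abstract above can be illustrated with partial correlations, one common statistical device for this purpose. The sketch below is an illustration under strong assumptions and is not a method taken from the reviewed work: synthetic Gaussian variables stand in for abundance data (real NGS counts are compositional and would need suitable transformation), and the function name `partial_correlations` is hypothetical.

```python
import numpy as np


def partial_correlations(samples):
    """Partial correlation of each pair of columns given all other columns,
    obtained by rescaling the inverse of the sample covariance matrix."""
    precision = np.linalg.inv(np.cov(samples, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Synthetic chain a -> b -> c: a and c are associated only through b.
rng = np.random.default_rng(0)
n = 5000
a = rng.normal(size=n)
b = a + 0.5 * rng.normal(size=n)
c = b + 0.5 * rng.normal(size=n)
samples = np.column_stack([a, b, c])

marginal = np.corrcoef(samples, rowvar=False)
partial = partial_correlations(samples)
print(marginal[0, 2])  # strong marginal association between a and c
print(partial[0, 2])   # close to zero once b is accounted for
```

In this toy setting, a network built from marginal correlations would link a and c, whereas the partial correlations keep only the two direct links of the chain.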

    BIS uncovers a coevolving signal between E1 and the Stem region that regulates HCV fusion.

    No full text
    (A) Position of the three amino acid residues that differ between H77 (blue) and A40 (red) and are hypothesized to coevolve according to BIS prediction (gt1a cluster 5). The three H77 amino acids will be replaced by A40 residues individually or altogether to challenge BIS prediction. (B) Impact of the E1/Stem coevolution signal on HCV entry. Infectious titers of HCVpp viral particles harboring H77 (blue), A40 (red) and H77/A40 E1E2 chimera were determined. Two E1 H77 residues (S112, I117), a single H77 E2 residue (D462) or both (S112, I117, D462) were introduced into E1E2 A40. The different envelopes were incorporated at the surface of HCVpp, then used to infect Huh7.5. Infectious titers were quantified 72h post infection by flow cytometry (mean ± SD; n = 3). *p<0.05, ns non-significant. (C) H77/A40 E1 and E2 chimera expression and incorporation onto HCVpp. Expression in transfected 293T cells (Cell lysates) and incorporation onto concentrated pseudoparticles (Viral Pellets) of E1 and E2 from the different H77/A40 chimera. Detection of E1 and E2 onto pseudoparticles harboring no envelope glycoproteins was used as negative control. MLV-Capsid (CA) was detected to control equivalent HCVpp production between chimera. (D) Impact of the E1/Stem coevolution signal on HCV fusion. LTRhiv-luciferase vector transduced 293T cells expressing the different E1E2 H77/A40 chimeric envelope glycoproteins were co-cultured with Tat-expressing Huh7.5 cells. Co-cultured cells were exposed to an acid shock (pH5, orange) or not (pH7, red) and luciferase activities were determined 72h post-exposure. Results are presented in relative light units (RLU) for each experimental condition (mean ± SD; n = 3). *p<0.05, ***p<0.001.

    BIS as a methodology to decrypt virus entry mechanisms.

    No full text
    Schematic representation of the experimental approach employed in this study, from BIS computational analysis to the design and challenge of a mechanistic model of viral fusion. Following sequence analysis, matrices of E1E2 amino-acid coevolution were generated by BIS for different HCV genotypes. Plotting of matrix coevolution networks onto the E2core structure unveiled a potential scenario of E1 and E2 rearrangements during HCV fusion, which involved the BL domain of E2. At the protein domain level, the construction of a soluble form of the BL and several experimental assays supported this hypothesis. In parallel, at the amino acid level, the experimental validation of coevolution signals between specific residues of E1 and of the BL highlighted the critical role of E1-BL networks in regulating fusogenic rearrangements (and more generally, the critical role of coevolving networks between E1 and E2 C-terminal regions). Altogether, this approach allows us to propose an HCV fusion model where BL movements and E1 refolding are critical in the induction of E1E2 interdependent, fusogenic rearrangements. Being applicable to other viral proteins and viruses, this approach provides opportunities to uncover undescribed viral-mediated mechanisms and design innovative translational strategies for their inhibition.

    BIS analysis of dengue E-Pr coevolving residues.

    No full text
    (A) Three-dimensional representation of DENV Pr (Black, PDB 3C6R) and E (multi-color, PDB 1K4R). A linear representation of the PrM-E polyprotein is depicted below the protein structures. Starting and ending residue positions of each protein (Pr, M and E) and E domain are indicated. E domains are annotated by distinct colors: DI, domain I (red); DII, domain II (yellow); DIII, domain III (blue); Tmd, transmembrane (black). (B) Organization and positions of the PrM-E clusters 2 (orange), 7 (blue) and 9 (pink) on three-dimensional representations of the DENV E and Pr proteins. Clusters 2 and 9 are depicted on a dimeric or trimeric E-Pr structure respectively at low pH condition (PDB 3C6R). Cluster 7 is depicted on a trimeric E-Pr structure at neutral pH condition (PDB 3XIY). Linear representations of the two proteins are also depicted on the top of each structure and cluster block locations are indicated (precise cluster positions are reported in S1 Table, http://www.plospathogens.org/article/info:doi/10.1371/journal.ppat.1006908#ppat.1006908.s003). The close proximity between DENV E and Pr cluster 2 blocks is enlarged.